feat(bloom): aggregate bloom filters #2340

sistemd · 2024-10-29T11:47:59Z

This PR adds code for aggregate bloom filters. The code isn't used for anything yet, but under a conditional flag it compares the results obtained from aggregate bloom filters with those obtained from regular bloom filters to verify that the implementation is correct.

The bit matrix is implemented manually because it turned out to be 10x faster this way than using the bitvec crate, and it's also extremely simple. I have a hunch that we could get even more improvements if we used u32 instead of u8 to back the bit matrix. But this can be done later.
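For readers who want a picture of what a hand-rolled, `u8`-backed bit matrix can look like, here is a minimal sketch. It is not the code from this PR: the `BitMatrix` name, the row-major layout, and the most-significant-bit-first ordering are all assumptions made for illustration.

```rust
/// A dense, row-major bit matrix backed by bytes.
/// Hypothetical sketch: names and layout are assumptions, not the PR's code.
pub struct BitMatrix {
    rows: usize,
    cols: usize,
    /// Number of bytes used by one row (columns rounded up to whole bytes).
    row_bytes: usize,
    data: Vec<u8>,
}

impl BitMatrix {
    /// Creates a matrix with all bits cleared.
    pub fn new(rows: usize, cols: usize) -> Self {
        let row_bytes = (cols + 7) / 8;
        Self { rows, cols, row_bytes, data: vec![0; rows * row_bytes] }
    }

    /// Sets the bit at (`row`, `col`). Column 0 is the most significant
    /// bit of the row's first byte.
    pub fn set(&mut self, row: usize, col: usize) {
        assert!(row < self.rows && col < self.cols);
        self.data[row * self.row_bytes + col / 8] |= 0x80u8 >> (col % 8);
    }

    /// Returns whether the bit at (`row`, `col`) is set.
    pub fn get(&self, row: usize, col: usize) -> bool {
        assert!(row < self.rows && col < self.cols);
        (self.data[row * self.row_bytes + col / 8] & (0x80u8 >> (col % 8))) != 0
    }

    /// Exposes the backing bytes, e.g. for storing them as a blob.
    pub fn as_bytes(&self) -> &[u8] {
        &self.data
    }
}
```

Every access is one index computation plus a shift and a mask, which is roughly what "extremely simple" means here.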

The next PR will implement storing these aggregate filters in the DB, and then the final PR will implement actually using them from the RPC calls (and finally reaping the benefits).

sistemd · 2024-10-30T16:32:08Z

> I have a hunch that we could get even more improvements if we used u32 instead of u8 to back the bit matrix. But this can be done later.

However, the problem with this is that we still store/load Vec<u8> from the DB, and converting that to Vec<u32> safely requires copying. So it may actually turn out to be slower.

CHr15F0x · 2024-10-31T10:04:38Z

> I have a hunch that we could get even more improvements if we used u32 instead of u8 to back the bit matrix. But this can be done later.
>
> However, the problem with this is that we still store/load Vec<u8> from the DB, and converting that to Vec<u32> safely requires copying. So it may actually turn out to be slower.

If the copying happens only once it shouldn't have any impact.
You could even try usize to match the word width.
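To make the copying concern concrete: a `Vec<u8>` allocation is only guaranteed byte alignment, so it cannot safely be reinterpreted as a vector of wider integers and has to be copied word by word. The sketch below shows one way such a one-time conversion to `usize` could look. The function names, the little-endian word layout, and the zero-padding of a trailing partial word are assumptions for illustration, not something this PR defines.

```rust
/// Copies a byte buffer (e.g. one loaded from the DB) into machine words.
/// A trailing partial word, if any, is zero-padded.
/// Hypothetical helper: name and little-endian layout are assumptions.
pub fn bytes_to_words(bytes: &[u8]) -> Vec<usize> {
    const W: usize = std::mem::size_of::<usize>();
    bytes
        .chunks(W)
        .map(|chunk| {
            let mut buf = [0u8; W];
            buf[..chunk.len()].copy_from_slice(chunk);
            usize::from_le_bytes(buf)
        })
        .collect()
}

/// Inverse of `bytes_to_words`: flattens words back into bytes and drops
/// the padding so that exactly `len` bytes are returned.
pub fn words_to_bytes(words: &[usize], len: usize) -> Vec<u8> {
    let mut out: Vec<u8> = words.iter().flat_map(|w| w.to_le_bytes()).collect();
    out.truncate(len);
    out
}
```

If the conversion runs once per load and the words are then reused across many lookups, the copy is amortized. Whether that pays off in practice would need measuring.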

sistemd · 2024-10-31T10:54:38Z

> You could even try usize to match the word width.

Good idea!

CHr15F0x · 2024-10-31T12:24:18Z

crates/storage/src/bloom.rs

@@ -13,18 +75,185 @@ use crate::ReorgCounter;
 // filter.
 pub const EVENT_KEY_FILTER_LIMIT: usize = 16;

+/// An aggregate of all Bloom filters for a given range of blocks.
+/// Before being added to `AggregateBloom`, each [`BloomFilter`] is
+/// rotated by 90 degrees.


nit: I guess the correct mathematical wording would be transposition.

Suggested change

-/// rotated by 90 degrees.
+/// rotated by 90 degrees (transposed).
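As an illustration of the idea behind the doc comment, here is a toy sketch of how storing each block's filter as a column could let a single lookup cover a whole range of blocks. This is an interpretation, not the PR's `AggregateBloom`: the `Aggregate` type, the tiny sizes, and passing filters as lists of set bit indices are all assumptions.

```rust
/// Toy sizes, chosen for illustration only.
const FILTER_BITS: usize = 16; // bits in one per-block filter
const BLOCKS: usize = 8; // blocks covered by one aggregate

/// Hypothetical aggregate: row `i` holds bit `i` of every block's filter,
/// one bit per block (block 0 in the most significant bit).
pub struct Aggregate {
    rows: [u8; FILTER_BITS],
}

impl Aggregate {
    pub fn new() -> Self {
        Self { rows: [0; FILTER_BITS] }
    }

    /// Stores a block's filter, given as its set bit indices, as a column.
    pub fn insert(&mut self, block: usize, filter_bits: &[usize]) {
        assert!(block < BLOCKS);
        for &bit in filter_bits {
            self.rows[bit] |= 0x80u8 >> block;
        }
    }

    /// Returns the blocks whose filters have all of `key_bits` set.
    /// As with any bloom filter, these are candidates, not confirmed hits.
    pub fn candidate_blocks(&self, key_bits: &[usize]) -> Vec<usize> {
        // AND the rows for the key's bits; surviving bits mark the blocks.
        let mask = key_bits
            .iter()
            .fold(0xFFu8, |acc, &bit| acc & self.rows[bit]);
        (0..BLOCKS)
            .filter(|&b| (mask & (0x80u8 >> b)) != 0)
            .collect()
    }
}
```

With this layout, checking a key against all blocks costs one row read per key bit instead of one filter read per block.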

CHr15F0x

LGTM 🙇

I'm curious to see how it boosts some of the really slow event queries in the RPC 👀.

sistemd requested a review from a team as a code owner October 29, 2024 11:48

sistemd force-pushed the sistemd/aggregate-bloom-filters branch from 8789370 to 39d5490 Compare October 29, 2024 11:52

CHr15F0x reviewed Oct 31, 2024

CHr15F0x approved these changes Oct 31, 2024

aggregate bloom filter

32b48ae

sistemd force-pushed the sistemd/aggregate-bloom-filters branch from 39d5490 to 32b48ae Compare November 1, 2024 22:55
